"Good" and "Bad" Diversity in Majority Vote Ensembles
Abstract
Although diversity in classifier ensembles is desirable, its relationship with ensemble accuracy is not straightforward. Here we derive a decomposition of the majority vote error into three terms: average individual accuracy, "good" diversity and "bad" diversity. The good diversity term is subtracted from the individual error, whereas the bad diversity term is added to it. We relate the two diversity terms to the previously defined majority vote limits (the patterns of success and failure). A simulation study demonstrates how the proposed decomposition can be used to gain insight into majority vote classifier ensembles.
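The abstract does not spell out the formula, so the following is only a minimal sketch under stated assumptions: labels and votes are coded as -1/+1, the number of voters is odd (no ties), and "diversity" at a data point is taken to be the fraction of classifiers that disagree with the ensemble vote. Under those assumptions the majority vote error equals the average individual error, minus the disagreement accumulated where the ensemble is right ("good"), plus the disagreement accumulated where it is wrong ("bad"). The function name `decompose_majority_error` and the simulated data are hypothetical and not taken from the paper.

```python
import numpy as np


def decompose_majority_error(votes, y):
    """Split the majority vote error into individual error, good and bad diversity.

    votes: array of shape (T, N) with entries in {-1, +1}, one row per classifier.
    y:     array of shape (N,) with the true labels in {-1, +1}.
    Assumes T is odd so that the majority vote is never tied.
    """
    votes = np.asarray(votes)
    y = np.asarray(y)
    n_voters, n_points = votes.shape
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")

    ensemble = np.sign(votes.sum(axis=0))            # majority vote per point
    ensemble_wrong = ensemble != y                   # True where the ensemble errs
    individual_error = (votes != y).mean()           # averaged over voters and points

    # Fraction of voters that disagree with the ensemble decision at each point.
    disagreement = (votes != ensemble).mean(axis=0)

    good = disagreement[~ensemble_wrong].sum() / n_points  # ensemble correct
    bad = disagreement[ensemble_wrong].sum() / n_points    # ensemble wrong
    return ensemble_wrong.mean(), individual_error, good, bad


# Simulated example: 5 voters, each wrong independently with probability 0.3.
rng = np.random.default_rng(0)
y_true = rng.choice([-1, 1], size=200)
flips = rng.choice([1, -1], p=[0.7, 0.3], size=(5, 200))
sim_votes = y_true * flips

maj, ind, good, bad = decompose_majority_error(sim_votes, y_true)
print(f"majority error      = {maj:.4f}")
print(f"individual - good + bad = {ind - good + bad:.4f}")
```

The two printed numbers should coincide, because at each point the disagreement equals the fraction of wrong voters when the ensemble is right and the fraction of correct voters when it is wrong.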
Related resources
Examining the Relationship Between Majority Vote Accuracy and Diversity in Bagging and Boosting
Much current research is undertaken into combining classifiers to increase the classification accuracy. We show, by means of an enumerative example, how combining classifiers can lead to much greater or lesser accuracy than each individual classifier. Measures of diversity among the classifiers taken from the literature are shown to only exhibit a weak relationship with majority vote accuracy. ...
On Voting Ensembles of Classifiers (Extended Abstract)
We study the classification ability of majority-vote ensembles of classifiers. A majority ensemble classifies a pattern by letting each member of the ensemble cast a single vote for the correct class and decides according to a simple majority or a special majority vote. We give upper and lower bounds on the classification performance of a majority ensemble as a function of the classification perform...
Ensembles of nearest neighbour classifiers and serial analysis of gene expression
In this paper, we present experimental results obtained with ensembles of nearest neighbour classifiers on the binary problem of cancer classification using serial analysis of gene expression (SAGE) data. Nearest neighbours are selected as classifiers since they were rarely employed in building ensembles because their predictions are stable to small perturbations of data, which...
Online approach to handle concept drifting data streams using diversity
Concept drift is a trend observed in almost all real-time applications. Many online and offline algorithms have been developed to analyze this drift and to train learning algorithms. Different levels of diversity are required before and after a drift to get the best generalization accuracy. In this paper, we present a new online approach, Extended Dynamic Weighted Majority with diversity (EDWM) ...